Superscalar Branch Instruction Processor
Abstract
In this paper we describe the design of the branch unit implemented in some models of the recently announced IBM AS/400. The unit is a modification of one originally designed for the experimental IBM ESA/370 SCISM processor. Its main feature is the ability to remove branch instructions from the instruction stream dynamically and pre-process them before they enter the pipeline. This allows the processor to issue branch-free code from the instruction stack while the branches execute separately, in parallel with the processing of other instructions. In the SCISM processor, branch prediction is achieved by tagging instructions in the cache; in the AS/400 it is achieved with a branch history table.
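The abstract does not say how the AS/400 branch history table is organised, so the following is only a generic textbook sketch of the idea, not the design described in the paper. It assumes a direct-mapped table of 2-bit saturating counters indexed by the low bits of the branch address. The names `BranchHistoryTable` and `run_trace`, the table size, and the initial counter value are all illustrative assumptions.

```python
class BranchHistoryTable:
    """Direct-mapped table of 2-bit saturating counters (generic scheme,
    assumed for illustration; not taken from the paper)."""

    def __init__(self, entries=1024):
        # The entry count must be a power of two so indexing can use a mask.
        assert entries > 0 and entries & (entries - 1) == 0
        self.mask = entries - 1
        # Counter values 0 and 1 predict not taken; 2 and 3 predict taken.
        # Starting at 1 (weakly not taken) is an arbitrary assumption.
        self.counters = [1] * entries

    def _index(self, branch_addr):
        # Use the low-order address bits to select a table entry.
        return branch_addr & self.mask

    def predict(self, branch_addr):
        """Return True if the branch is predicted taken."""
        return self.counters[self._index(branch_addr)] >= 2

    def update(self, branch_addr, taken):
        """Move the counter one step toward the actual outcome."""
        i = self._index(branch_addr)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def run_trace(bht, trace):
    """Count correct predictions over a list of (address, taken) pairs."""
    correct = 0
    for addr, taken in trace:
        if bht.predict(addr) == taken:
            correct += 1
        bht.update(addr, taken)
    return correct


if __name__ == "__main__":
    # A loop-closing branch at a hypothetical address: taken 9 times,
    # then not taken once on loop exit.
    trace = [(0x40, True)] * 9 + [(0x40, False)]
    print(run_trace(BranchHistoryTable(entries=16), trace), "of", len(trace))
```

With this trace the predictor misses only the first iteration (the counter starts in a not-taken state) and the loop exit, which is the usual behaviour that makes history-based prediction attractive for loop branches.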
Related papers
Delayed Branches Versus Dynamic Branch Prediction in a High-Performance Superscalar Architecture
While delayed branch mechanisms were popular with the designers of RISC processors, most superscalar processors deploy dynamic branch prediction to minimise run-time branch penalties. We propose a generalised branch delay mechanism that is more suited to superscalar processors. We then quantitatively compare the performance of our delayed branch mechanism with run-time branch prediction, in the...
Sensitivity Analysis of a Superscalar Processor Model
Superscalar processors obtain their performance by exploiting instruction level parallelism in programs. Their performance is therefore limited by characteristics of programs and the design of the processor. Due to the complexity involved, estimating the performance of any superscalar processor design is a difficult task. Quick prediction of performance improvement arising from architecture mod...
Support for Speculative Execution in High-Performance Processors
Superscalar and superpipelining techniques increase the overlap between the instructions in a pipelined processor, and thus these techniques have the potential to improve processor performance by decreasing the average number of cycles between the execution of adjacent instructions. Yet, to obtain this potential performance benefit, an instruction scheduler for this high-performance processor m...
Adding Fast Interrupts to Superscalar Processors
The hardware cost of taking an interrupt is increasing as processors become more superscalar. Using FLIP, an aggressively superscalar processor which we have designed and tested in Verilog, we demonstrate that interrupts can be fast and inexpensive. We trace individual signals through FLIP’s pipeline stages to show that fast interrupts require negligible new hardware. Except for linkage informa...
HydraScalar: A Multipath-Capable Simulator
Even sophisticated branch-prediction techniques necessarily suffer some mispredictions, and even relatively small mispredict rates hurt performance substantially in current-generation processors. This suggests the study of multipath execution, in which the processor simultaneously executes code from both the taken and not-taken outcomes of a branch. This paper describes HydraScalar, a simulator...
Modeled and Measured Instruction Fetching Performance for Superscalar Microprocessors
Instruction fetching is critical to the performance of a superscalar microprocessor. We develop a mathematical model for three different cache techniques and evaluate its performance both in theory and in simulation using the SPEC95 suite of benchmarks. In all the techniques, the fetching performance is dramatically lower than ideal expectations. To help remedy the situation, we also evaluate i...